
FAIREDU: A Multiple Regression-Based Method for Enhancing Fairness in Machine Learning Models for Educational Applications

Pham, Nga, Do, Minh Kha, Dai, Tran Vu, Hung, Pham Ngoc, Nguyen-Duc, Anh

arXiv.org Artificial Intelligence

Fairness in artificial intelligence and machine learning (AI/ML) models is becoming critically important, especially as decisions made by these systems impact diverse groups. In education, a vital sector for all countries, the widespread application of AI/ML systems raises specific concerns regarding fairness. Current research predominantly focuses on fairness for individual sensitive features, which limits the comprehensiveness of fairness assessments. This paper introduces FAIREDU, a novel and effective method designed to improve fairness across multiple sensitive features. Through extensive experiments, we evaluate FAIREDU's effectiveness in enhancing fairness without compromising model performance. The results demonstrate that FAIREDU addresses intersectionality across features such as gender, race, and age, outperforming state-of-the-art methods with minimal effect on model accuracy. The paper also explores potential future research directions to further enhance the method's robustness and its applicability to various machine-learning models and datasets.
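The abstract does not spell out the algorithm, but the general idea behind regression-based fairness preprocessing can be sketched as follows: regress each feature on the sensitive attributes and keep only the residuals, so the resulting features are linearly uncorrelated with the sensitive attributes. This is a minimal illustration of that family of techniques, not FAIREDU itself; the data and function names are hypothetical.

```python
import numpy as np

def residualize(X, S):
    """Remove the linear influence of sensitive attributes S from features X.

    Regresses each column of X on S (with an intercept term) via multiple
    regression and returns the residuals, which are linearly uncorrelated
    with the columns of S.
    """
    S1 = np.column_stack([np.ones(len(S)), S])     # design matrix with intercept
    coef, *_ = np.linalg.lstsq(S1, X, rcond=None)  # least-squares fit
    return X - S1 @ coef                           # residuals = debiased features

# Toy data: one feature strongly correlated with a binary sensitive attribute.
rng = np.random.default_rng(0)
s = rng.integers(0, 2, size=200).astype(float)     # sensitive attribute
x = 2.0 * s + rng.normal(size=200)                 # biased feature
X_fair = residualize(x.reshape(-1, 1), s.reshape(-1, 1))

# After residualizing, the correlation with s is numerically zero.
print(abs(float(np.corrcoef(X_fair[:, 0], s)[0, 1])) < 1e-8)  # True
```

Residuals of a least-squares fit are orthogonal to every column of the design matrix, which is why the debiased feature ends up uncorrelated with the sensitive attribute.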


Testing and Evaluation of Large Language Models: Correctness, Non-Toxicity, and Fairness

Wang, Wenxuan

arXiv.org Artificial Intelligence

Large language models (LLMs), such as ChatGPT, have rapidly penetrated people's work and daily lives over the past few years, due to their extraordinary conversational skills and intelligence. ChatGPT has become the fastest-growing software in terms of user numbers in human history and an important foundational model for the next generation of artificial intelligence applications. However, the outputs of LLMs are not entirely reliable, often containing factual errors, biases, and toxicity. Given their vast number of users and wide range of application scenarios, these unreliable responses can lead to many serious negative impacts. This thesis introduces the exploratory works in the field of language model reliability conducted during the PhD study, focusing on the correctness, non-toxicity, and fairness of LLMs from both software testing and natural language processing perspectives. First, to measure the correctness of LLMs, we introduce two testing frameworks, FactChecker and LogicAsker, to evaluate factual knowledge and logical reasoning accuracy, respectively. Second, for the non-toxicity of LLMs, we introduce two works for red-teaming LLMs. Third, to evaluate the fairness of LLMs, we introduce two evaluation frameworks, BiasAsker and XCulturalBench, to measure the social bias and cultural bias of LLMs, respectively.


Automated Program Repair: Emerging trends pose and expose problems for benchmarks

Renzullo, Joseph, Reiter, Pemma, Weimer, Westley, Forrest, Stephanie

arXiv.org Artificial Intelligence

A variety of techniques have been developed, e.g., evolutionary computation [60, 133], methods incorporating templated mutation operators [71], semantic inference techniques [79] targeting single-cause defects, and methods designed to handle multi-hunk bugs [100]. Increasingly, researchers have applied ML-based methods to APR tasks (Section 3), but data leakage is a concern (Section 4). Each new technique, or modification of an existing technique, tends to be developed by an independent research team, without reference to a common, formal definition of APR. Benchmarks are not enough to standardize evaluation on their own (Section 5). As motivating examples, consider the following inconsistencies in the published literature: Correctness. VFix [123] identifies correct patches that pass all test cases and are semantically or syntactically equivalent to the original bug-fix, while VRepair [26] reports repair accuracy in terms of semantic equivalence to the original bug-fix, and SynFix [10] defines correctness simply as passing the test cases. Each of these is a reasonable definition, but collectively, their differences make it difficult to compare results.


A Systematic Literature Review on Explainability for Machine/Deep Learning-based Software Engineering Research

Cao, Sicong, Sun, Xiaobing, Widyasari, Ratnadira, Lo, David, Wu, Xiaoxue, Bo, Lili, Zhang, Jiale, Li, Bin, Liu, Wei, Wu, Di, Chen, Yixin

arXiv.org Artificial Intelligence

The remarkable achievements of Artificial Intelligence (AI) algorithms, particularly in Machine Learning (ML) and Deep Learning (DL), have fueled their extensive deployment across multiple sectors, including Software Engineering (SE). However, due to their black-box nature, these promising AI-driven SE models are still far from being deployed in practice. This lack of explainability poses unwanted risks for their applications in critical tasks, such as vulnerability detection, where decision-making transparency is of paramount importance. This paper endeavors to elucidate this interdisciplinary domain by presenting a systematic literature review of approaches that aim to improve the explainability of AI models within the context of SE. The review canvasses work appearing in the most prominent SE & AI conferences and journals, and spans 63 papers across 21 unique SE tasks. Based on three key Research Questions (RQs), we aim to (1) summarize the SE tasks where XAI techniques have shown success to date; (2) classify and analyze different XAI techniques; and (3) investigate existing evaluation approaches. Based on our findings, we identified a set of challenges remaining to be addressed in existing studies, together with a roadmap highlighting potential opportunities we deemed appropriate and important for future work.


Large Language Models for Software Engineering: A Systematic Literature Review

Hou, Xinyi, Zhao, Yanjie, Liu, Yue, Yang, Zhou, Wang, Kailong, Li, Li, Luo, Xiapu, Lo, David, Grundy, John, Wang, Haoyu

arXiv.org Artificial Intelligence

Large Language Models (LLMs) have significantly impacted numerous domains, including Software Engineering (SE). Many recent publications have explored LLMs applied to various SE tasks. Nevertheless, a comprehensive understanding of the application, effects, and possible limitations of LLMs on SE is still in its early stages. To bridge this gap, we conducted a systematic literature review on LLM4SE, with a particular focus on understanding how LLMs can be exploited to optimize processes and outcomes. We collect and analyze 229 research papers from 2017 to 2023 to answer four key research questions (RQs). In RQ1, we categorize the different LLMs that have been employed in SE tasks, characterizing their distinctive features and uses. In RQ2, we analyze the methods used in data collection, preprocessing, and application, highlighting the role of well-curated datasets for successful LLM4SE implementations. RQ3 investigates the strategies employed to optimize and evaluate the performance of LLMs in SE. Finally, RQ4 examines the specific SE tasks where LLMs have shown success to date, illustrating their practical contributions to the field. From the answers to these RQs, we discuss the current state of the art and trends, identify gaps in existing research, and flag promising areas for future study.


Enhanced Fairness Testing via Generating Effective Initial Individual Discriminatory Instances

Ma, Minghua, Tian, Zhao, Hort, Max, Sarro, Federica, Zhang, Hongyu, Lin, Qingwei, Zhang, Dongmei

arXiv.org Artificial Intelligence

Fairness testing aims at mitigating unintended discrimination in the decision-making process of data-driven AI systems. Individual discrimination may occur when an AI model makes different decisions for two distinct individuals who are distinguishable solely according to protected attributes, such as age and race. Such instances reveal biased AI behaviour and are called Individual Discriminatory Instances (IDIs). In this paper, we propose an approach for the selection of the initial seeds used to generate IDIs for fairness testing. Previous studies mainly used random initial seeds to this end. However, this phase is crucial, as these seeds are the basis of the follow-up IDI generation. We dubbed our proposed seed selection approach I&D. It generates a large number of initial IDIs exhibiting great diversity, aiming at improving the overall performance of fairness testing. Our empirical study reveals that I&D is able to produce a larger number of IDIs than four state-of-the-art seed generation approaches, generating 1.68X more IDIs on average. Moreover, we compare the use of I&D to train machine learning models and find that using I&D reduces the number of remaining IDIs by 29% when compared to the state-of-the-art, thus indicating that I&D is effective for improving model fairness.
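The IDI definition above (two inputs differing only in a protected attribute receiving different decisions) can be sketched as a simple check. This is only an illustration of the definition, not the I&D seed-selection algorithm; the model and attribute names are hypothetical.

```python
def is_idi(model, instance, protected_attr, values):
    """Return True if `instance` is an Individual Discriminatory Instance:
    varying only the protected attribute changes the model's decision."""
    decisions = set()
    for v in values:
        variant = dict(instance, **{protected_attr: v})  # flip protected attr only
        decisions.add(model(variant))
    return len(decisions) > 1  # more than one decision => discrimination

# Hypothetical biased model: approves on income, but penalizes age >= 60.
def toy_model(x):
    return int(x["income"] > 50 and not (x["age"] >= 60))

applicant = {"income": 80, "age": 30}
print(is_idi(toy_model, applicant, "age", [30, 65]))  # True: age alone flips it
```

Fairness-testing tools search the input space for such instances; the quality of the initial seeds fed to that search is exactly what the paper's I&D approach targets.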